Using Term Sense to Improve Language Modeling Approach to Genomic IR
Authors
Abstract
Genomic IR, characterized by highly specific information needs, severe synonymy and polysemy problems, long term names, and a rapidly growing literature, poses a challenge to the IR community. In this paper, we focus on addressing the synonymy and polysemy issue within the language modeling framework. Unlike translation models and traditional query expansion techniques, we incorporate term sense into the basic language model, which is a more fundamental approach to synonymy and polysemy in IR. The sense approach not only maintains the simplicity of language models but also makes document ranking efficient and effective. A comparative experiment on the TREC 2004 Genomic Track data shows a significant improvement in retrieval performance after incorporating term sense into a basic language model: the MAP (mean average precision) rises from 29.17% (the baseline system) to 36.94%. The performance of the sense approach is also significantly superior to the mean (21.72%) of the official runs submitted to the TREC 2004 Genomic Track and is comparable to the best result (40.75%) of the track. Most runs in the track make extensive use of various query expansion and pseudo-relevance feedback techniques, whereas our approach does nothing beyond incorporating term sense. This supports the view that semantic smoothing, i.e., the incorporation of synonym and sense information into language models, is a more standard approach to achieving the effects that traditional query expansion and pseudo-relevance feedback techniques aim for.
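To make the idea concrete, the following is a minimal sketch of query-likelihood ranking in which documents and queries are indexed by sense identifiers rather than surface terms. It is not the authors' implementation: the abstract does not say how senses are extracted or which smoothing method is used, so the synonym table `SENSE_OF`, the sense identifiers, the toy documents, and the choice of Jelinek-Mercer smoothing with `lam=0.5` are all assumptions made for illustration. The sketch covers only synonymy (several surface terms sharing one sense); polysemy would additionally require disambiguating a term in context.

```python
import math
from collections import Counter

# Hypothetical synonym table (an assumption, not from the paper):
# surface term -> sense identifier.
SENSE_OF = {
    "tp53": "GENE:TP53",
    "p53": "GENE:TP53",
    "tumor": "CONCEPT:NEOPLASM",
    "tumour": "CONCEPT:NEOPLASM",
}


def to_senses(tokens):
    """Map each token to its sense id; unmapped tokens stand for themselves."""
    return [SENSE_OF.get(t.lower(), t.lower()) for t in tokens]


def to_terms(tokens):
    """Baseline indexing: lowercase surface terms only."""
    return [t.lower() for t in tokens]


def score(query, doc, coll_counts, coll_len, lam=0.5):
    """Log query likelihood with Jelinek-Mercer smoothing (assumed here)."""
    doc_counts = Counter(doc)
    total = 0.0
    for unit in query:
        p_doc = doc_counts[unit] / len(doc) if doc else 0.0
        p_coll = coll_counts[unit] / coll_len
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:  # unit never occurs in the collection
            return float("-inf")
        total += math.log(p)
    return total


def rank(query_tokens, docs, use_senses=True):
    """Return document names sorted by descending query likelihood."""
    convert = to_senses if use_senses else to_terms
    query = convert(query_tokens)
    indexed = {name: convert(tokens) for name, tokens in docs.items()}
    coll_counts = Counter(u for units in indexed.values() for u in units)
    coll_len = sum(coll_counts.values())
    return sorted(
        indexed,
        key=lambda name: score(query, indexed[name], coll_counts, coll_len),
        reverse=True,
    )


# Toy collection, invented for illustration.
DOCS = {
    "d1": "p53 mutation found in tumour samples".split(),
    "d2": "protein folding kinetics in yeast".split(),
    "d3": "TP53 regulates apoptosis".split(),
}

if __name__ == "__main__":
    print(rank(["TP53", "tumour"], DOCS, use_senses=False))
    print(rank(["TP53", "tumour"], DOCS, use_senses=True))
```

On this toy collection, the term-based baseline ranks `d3` first because `d1` only mentions the synonym `p53`, while the sense-based index places `d1` first because it matches both query senses. No query expansion step is involved in either case; the difference comes only from the indexing unit.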
Similar resources
Using Concept-Based Indexing to Improve Language Modeling Approach to Genomic IR
Genomic IR, characterized by highly specific information needs, severe synonymy and polysemy problems, long term names, and a rapidly growing literature, poses a challenge to the IR community. In this paper, we focus on addressing the synonymy and polysemy issue within the language model framework. Unlike the way translation models and traditional query expansion techniques approach this issue, we i...
Word Sense Disambiguation Improves Information Retrieval
Previous research has conflicting conclusions on whether word sense disambiguation (WSD) systems can improve information retrieval (IR) performance. In this paper, we propose a method to estimate sense distributions for short queries. Together with the senses predicted for words in documents, we propose a novel approach to incorporate word senses into the language modeling approach to IR and al...
Multi-lingual Indexing Support for CLIR using Language Modeling
An indexing model is the heart of an Information Retrieval (IR) system. Data structures such as term based inverted indices have proved to be very effective for IR using vector space retrieval models. However, when functional aspects of such models were tested, it was soon felt that better relevance models were required to more accurately compute the relevance of a document towards a query. It ...
Modeling the Relationship between Sense of Place, Social Capital and Tourism Support
The success of tourism development heavily depends on residents’ support. Broader literature suggests sense of place and social capital are important precedents of residents’ attitudes and behavior. However, limited attention has been paid to this topic in tourism, especially in Iran. Therefore, the aim of this research is to examine the relationship between sense of place, social capital and r...
Development and Validation of Teacher Emotional Support Scale: a structural equation modeling approach
A review of the literature found no validated model that examines the extent to which teachers support their students emotionally in EFL classrooms. Therefore, the present study addressed this issue by developing and validating a teacher emotional support scale in an Iranian English as a foreign language context. Main components of the scale have been specified based on Hamre...